Development of Prediction Model for Linked Data based on the Decision Tree

Authors

  • Dongkyu Jeon
  • Wooju Kim
Abstract

This paper details the analysis procedure for Submission 1 (previous predicted results submission) of Task A1. We induce decision tree models to predict pc:numberOfTenders. Because the target attribute is a non-negative integer, we use variance reduction as the attribute selection criterion. Input attributes are defined from the structural information of the Public Contracts Ontology, and we use description logic constructors to represent the meaning of that structural information in the training data. From all instances of the contract class, we draw 10 different input data sets by random sampling. Decision tree learning is performed with SAS E-miner. The final predictions for the test data are the average over the selected decision tree models, excluding a few models with extremely low R-square values.
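
The two numerical steps named in the abstract, variance reduction as the split criterion and averaging over models after dropping those with very low R-square, can be illustrated with a minimal Python sketch. This is not the authors' SAS E-miner workflow. The function names, the toy data, and the `min_r2` cutoff are all hypothetical, since the abstract does not state the threshold used to exclude models.

```python
from statistics import mean, pvariance


def variance_reduction(parent, left, right):
    """Parent variance minus the size-weighted variance of the two child nodes."""
    n = len(parent)
    weighted = (len(left) / n) * pvariance(left) + (len(right) / n) * pvariance(right)
    return pvariance(parent) - weighted


def best_threshold(xs, ys):
    """Return (threshold, gain) for the split x <= threshold with the largest gain."""
    best = (None, 0.0)
    # The largest value is skipped so that the right child is never empty.
    for t in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        gain = variance_reduction(ys, left, right)
        if gain > best[1]:
            best = (t, gain)
    return best


def average_prediction(predictions, r_squared, min_r2):
    """Average per-model predictions, skipping models with R-square below min_r2.

    min_r2 is an assumed cutoff; the abstract only says "extremely low".
    """
    kept = [p for p, r2 in zip(predictions, r_squared) if r2 >= min_r2]
    return mean(kept)


# Toy data (hypothetical): one numeric input attribute and a count-valued target.
xs = [1, 2, 3, 4]
ys = [2, 2, 8, 8]
threshold, gain = best_threshold(xs, ys)

# Three hypothetical models predict one test instance; the third fits poorly.
final = average_prediction([3.0, 5.0, 40.0], [0.6, 0.5, 0.01], min_r2=0.1)
```

On the toy data the split `x <= 2` separates the two target levels completely, so it removes all of the parent variance, and the poorly fitting third model is left out of the final average.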

Similar resources

Presenting a Model for Predicting Tax Evasion of Guilds Based on Data Mining Technique

In this research, considering the importance of the topic and the gap in previous research, a model for predicting tax evasion of guilds based on data mining techniques is presented. The analyzed data come from a review of 5600 tax files of all trades with tax codes in Qazvin province during the years 2013-2018. The tax file related to guilds is in five tax groups, including the guild group o...

Evaluation of liquefaction potential based on CPT results using C4.5 decision tree

The prediction of liquefaction potential of soil due to an earthquake is an essential task in Civil Engineering. The decision tree is a tree structure consisting of internal and terminal nodes which process the data to ultimately yield a classification. C4.5 is a known algorithm widely used to design decision trees. In this algorithm, a pruning process is carried out to solve the problem of the...

Comparison of gestational diabetes prediction with artificial neural network and decision tree models

Background: Gestational diabetes mellitus (GDM) is one of the most common metabolic disorders in pregnancy and is associated with serious complications. With early diagnosis, some of the maternal and fetal complications can be prevented. The aim of this study was to predict gestational diabetes mellitus early using two statistical models including artificial neural ne...

Steel Buildings Damage Classification by damage spectrum and Decision Tree Algorithm

Results of damage prediction in buildings can be used as a useful tool for managing and decreasing seismic risk of earthquakes. In this study, damage spectrum and C4.5 decision tree algorithm were utilized for damage prediction in steel buildings during earthquakes. In order to prepare the damage spectrum, steel buildings were modeled as a single-degree-of-freedom (SDOF) system and time-history...

Personal Credit Score Prediction using Data Mining Algorithms (Case Study: Bank Customers)

Knowledge and information extraction from data is an age-old concept in scientific studies. In industrial decision-making processes, the application of this concept gives rise to data-mining opportunities. Personal credit scoring is an ever-vital tool for banking systems in order to manage and minimize the inherent risks of the financial sector, thus, the design and improvement of credit scorin...

Using Combined Descriptive and Predictive Methods of Data Mining for Coronary Artery Disease Prediction: a Case Study Approach

Heart disease is one of the major causes of morbidity in the world. Currently, large proportions of healthcare data are not processed properly, thus, failing to be effectively used for decision making purposes. The risk of heart disease may be predicted via investigation of heart disease risk factors coupled with data mining knowledge. This paper presents a model developed using combined descri...

Publication date: 2014